In this file, we first describe the data used in our paper (DATA INFORMATION) and then present the programs (STATA PROGRAMS).

--------------------------------------------------------------------------------------------------------------------
DATA INFORMATION
----------------

The data needed to replicate our results consist of two Stata tables:
  - inflow.dta
  - uv.dta

These two tables are produced by Stata programs (described in the STATA PROGRAMS section below) from two data sources:
  
  - data from the BMO survey, directly downloadable from the Review's website (bmo.dta and rome_bmo.dta)
  
  - administrative data sets (from the FNA) directly available from the French national employment office "Pole Emploi".


We now explain in detail how to obtain the administrative data sets from Pole Emploi in order to produce these two tables. We then describe the content of these two tables.

--------------------------------------------------------------------------------------------------------------------
Access
------

Distribution of some data used in this paper requires permission from the relevant authorities, in this case the statistical services of Pole Emploi (the French Public Employment Office). Pole Emploi confirmed to us that the data used in our paper can be made available to researchers for replication purposes. In their request, researchers should make explicit reference to the replication of our article to ensure proper access to the data.

We are delighted to help researchers as much as possible with the application process.

Requests should be made to Pole Emploi, by contacting Stéphane Ducatez:
  Stéphane Ducatez
  Directeur des statistiques, des études et de l'évaluation
  Pôle Emploi
  1, avenue du Docteur-Gley
  75020 Paris
  France
  E-mail: stephane.ducatez@pole-emploi.fr

Researchers should ask for the eight quarterly extractions of the "histo40" file, for the years 2006 and 2007. The "histo40" file is a subset of the FNA file mentioned in the article.

Those extractions are in SAS format. They must be converted to Stata format and renamed as follows (examples):
  - "name of the extraction for Q1 2006" => "export0603"; 
  - "name of the extraction for Q2 2007" => "export0706";
  - "name of the extraction for Q4 2007" => "export0712";

The names "export----" appear in the program "prog_mk0_fna_q_h" described below.

The resulting Stata tables are what we refer to as the administrative data sets from Pole Emploi. They should be stored in a data/ folder.
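
The conversion and renaming step described above could be sketched as follows. This is only an illustration, under several assumptions: it uses the -import sas- command, which requires Stata 16 or later (with older versions, the conversion has to be done with another tool); the input names "raw/histo40_*.sas7bdat" are hypothetical and must be replaced by the names of the files delivered by Pole Emploi; and the suffixes 0609, 0612, 0703 and 0709 are inferred from the examples given above, so check them against prog_mk0_fna_q_h.

```stata
* Sketch only: convert the eight SAS extractions to Stata format.
* Requires Stata 16+ for -import sas-.
* The input file names below are hypothetical placeholders.
* Suffixes are YYMM of the last month of each quarter; 0609, 0612,
* 0703 and 0709 are inferred from the examples in this file.
local suffixes 0603 0606 0609 0612 0703 0706 0709 0712

foreach s of local suffixes {
    import sas using "raw/histo40_`s'.sas7bdat", clear
    save "data/export`s'.dta", replace
}
```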

--------------------------------------------------------------------------------------------------------------------
Description
-----------

inflow.dta and uv.dta have been constructed by us from two sets of data, as we explain below.

inflow.dta is a data set which contains one observation per unemployment spell starting in 2002 or 2004 and is the main table used for estimation.

inflow.dta has been created from anonymized extractions from the Fichier National des Assedic (FNA). There were 8 extractions (called export0603.dta, export0606.dta, etc.), one for each quarter in two consecutive years (2006 and 2007).

The FNA collects information on all unemployment spells that have taken place in France since 1990. Each of the 8 extracts contains 2.5% of the FNA at the time of extraction. There may be duplicate observations across extracts; our programs account for this.

We have explained in the data access section above how to obtain these extractions from Pole Emploi.

The creation of inflow.dta also involved the use of a table rome_bmo.dta, which is essentially a matrix that allows one to standardize the occupation codes between the FNA extracts and the BMO table. This table is made available on the Review's website.

uv.dta is a data set which contains unemployment and job vacancy stocks as well as labor market tightness for each occupation/region/time period (2002 and 2004).

uv.dta has been created from the FNA extracts and another table containing information from the BMO survey ("Enquete Besoin de Main d'Oeuvre"). This survey is conducted by Pôle Emploi and collects predicted job openings at the local level (occupations, regions and, after 2004, departments). The data from this survey are made available from the Review's website (bmo.dta).

To construct the inflow.dta and uv.dta data sets from the original data, run prog_mk0_fna_q_h.do, prog_mk1_small.do, prog_mk2_inflow.do and prog_mk3_markets.do, in this order. These programs are presented below (after the dictionary of variables). The administrative data sets and the ones created by these programs are stored in a data/ folder.
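
The run order above could be wrapped in a small master do-file, sketched below. It is not part of the replication package. It assumes that Stata's working directory contains both the do-files and the data/ folder, and that bmo.dta and rome_bmo.dta have been placed in data/ along with the converted extracts; only two of the eight extracts are checked here.

```stata
* Sketch only: hypothetical master do-file for the data construction step.
* Assumes the working directory holds the do-files and the data/ folder.

* Stop early if some of the expected inputs are missing
confirm file "data/export0603.dta"
confirm file "data/export0712.dta"
confirm file "data/bmo.dta"
confirm file "data/rome_bmo.dta"

* Run the construction programs in the required order
do prog_mk0_fna_q_h.do
do prog_mk1_small.do
do prog_mk2_inflow.do
do prog_mk3_markets.do

* Both estimation tables should now exist
confirm file "data/inflow.dta"
confirm file "data/uv.dta"
```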

--------------------------------------------------------------------------------------------------------------------
Variables
---------

The variables in inflow.dta are:

datedeb         date when unemployment spell starts                     
datefin         date when unemployment spell ends
dtdebfor1       date when training spell starts                      
dtfinfor1       date when training spell ends
dep             department                 
sortiemp        exit to employment
X_refwage       reference wage                 
X_ub            unemployment benefits                
X_duraff        length of affiliation to unemployment benefit system                 
X_male          male dummy                 
regionU         region                   
y0              year when spell started
m0              month when spell started            
y_id2004        year dummy                  
occup           occupation                    
X_age           age                  
X_age2          age squared

X_m2            month dummy (2 for February, 3 for March, etc)
X_m3            month dummy 
X_m4            month dummy 
X_m5            month dummy 
X_m6            month dummy 
X_m7            month dummy 
X_m8            month dummy 
X_m9            month dummy 
X_m10           month dummy 
X_m11           month dummy 
X_m12           month dummy 
Xh_nus2         number of unemployment spells in (y0 -2 years, y0)                  
Xh_duru2        time spent unemployed in (y0 -2 years, y0)                  
Xh_durZ2        time spent in training in (y0 -2 years, y0)                  
Xh_nus72        number of unemployment spells in (y0-7 years, y0-2)                  
Xh_duru72       time spent unemployed in (y0-7 years, y0-2)                  
Xh_durZ72       time spent in training in (y0-7 years, y0-2)

The variables in uv.dta are:

dep             department                
regionU         region                   
occup           occupation
y_id2004        year dummy                  
zone            geographical zone (merging some departments, not used in paper)                   
u_regionU       unemployment in region                  
u_zone          unemployment in zone (not used in paper)
v_regionU       vacancies in region                  
v_zone          vacancies in zone (not used in paper)                 
th_regionU      tightness (v/u) in region                  
th_zone         tightness (v/u) in zone (not used in paper)
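
As a quick consistency check on uv.dta, tightness can be recomputed from the vacancy and unemployment stocks. The sketch below assumes that th_regionU is exactly v_regionU / u_regionU, as the dictionary above suggests; if prog_mk3_markets treats special cases differently (for instance cells with zero unemployment), the assertion may fail without indicating an error in the data. The variable th_check is introduced here for illustration only.

```stata
* Sketch only: recompute regional tightness and compare with the stored value.
use "data/uv.dta", clear

gen double th_check = v_regionU / u_regionU

* Allow for rounding, since th_regionU may be stored in single precision
assert reldif(th_check, th_regionU) < 1e-6 if !missing(th_check, th_regionU)
```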


--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
STATA PROGRAMS
--------------

These programs run on Stata 11 and more recent versions.

We wrote detailed comments in the Stata do-files to explain what each section of the program does.

--------------------------------------------------------------------------------------------------------------------
data construction programs
--------------------------

prog_mk0_fna_q_h
prog_mk1_small
prog_mk2_inflow
prog_mk3_markets

These programs construct the two tables used for estimation, inflow.dta and uv.dta, from the original data sets (see the data description section above).

These programs should be run in the order shown above (mk0 then mk1, etc.).

All tables are taken from/saved in a data/ folder.

--------------------------------------------------------------------------------------------------------------------
estimation programs
-------------------

prog_estim
prog_estim_3c
prog_estim_3f

The programs run all the estimations used in the paper.

Run the data construction programs above before running the estimation programs.

Most specifications are in prog_estim. The two exceptions are the specifications considered in Figures 3c and 3f, which are obtained with prog_estim_3c and prog_estim_3f respectively.

These programs make use of two tables, inflow.dta and uv.dta (produced by the data construction programs), which should be stored in a data/ folder.

These programs run the two-step estimation method discussed in the paper, with 500 bootstrap iterations.

Estimation results are stored in tables (in the data/ folder) named boot_*.dta (where * denotes a given specification). For instance, benchmark results are in data/boot_bench.dta.
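
Once the estimation programs have run, a results table can be inspected as sketched below. The layout of the boot_*.dta tables is not documented in this file, so the example only lists and summarizes whatever variables the table contains.

```stata
* Sketch only: inspect the benchmark estimation results.
use "data/boot_bench.dta", clear

* List the stored variables, then show basic summary statistics
describe
summarize
```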

--------------------------------------------------------------------------------------------------------------------
graph programs
--------------

prog_mk_graphs

This program produces all the figures in the paper.

Run the estimation programs above before producing the graphs.

To get one of Figures 3b-3h, uncomment the corresponding line among lines 44-50 of prog_mk_graphs.

--------------------------------------------------------------------------------------------------------------------